The final work is the Rmd file named “data-analysis-replication.Rmd.” The html file is committed for reference only.
Before we get started:
Here are some “housekeeping” chunks.
This way, any missing packages are installed before the require() call. I deleted some packages that I initially thought I would use but didn’t. Just in case, the earlier full list was c(“tidyverse”, “dplyr”, “mosaic”, “ggplot2”, “infer”, “skimr”, “cowplot”, “broom”, “car”, “jtools”, “naniar”, “MuMIn”, “lme4”, “lmerTest”, “lmtest”, “broom.mixed”).
pkg <- c("tidyverse", "dplyr", "ggplot2", "lme4", "broom.mixed")
not_installed <- pkg[!pkg %in% rownames(installed.packages())]
if (length(not_installed) > 0) install.packages(not_installed)
lapply(pkg, require, character.only = TRUE)
## Loading required package: tidyverse
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: lme4
##
## Loading required package: Matrix
##
##
## Attaching package: 'Matrix'
##
##
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
##
## Loading required package: broom.mixed
## [[1]]
## [1] TRUE
##
## [[2]]
## [1] TRUE
##
## [[3]]
## [1] TRUE
##
## [[4]]
## [1] TRUE
##
## [[5]]
## [1] TRUE
This is a journal article written by three researchers from the University of Amsterdam, the Netherlands (Broedelet et al., 2023). It is named “Broedelet et al., 2023” in the repository/folder. Throughout, “section” and “subsection” refer to portions of the original article, whereas “Figure” and “Table” refer to the ones with titles in this Rmd/html file.
The study asks two major research questions.
Here are some terms to clarify (from both the article and me):
DLD: DLD is a communication disorder that interferes with learning, understanding, and using language. More information can be found here: https://www.nidcd.nih.gov/health/developmental-language-disorder. The descriptive picture of the causes and symptoms of this disorder is not yet very clear, so a wide variety of research focuses on the population with DLD, studying their abilities across all linguistic domains and other related areas, in most cases in comparison to typically developing (TD) children. In this paper, the objective was to study the role (“nature and extent”) of statistical learning in lexical-semantic development. In linguistics, how humans acquire their first languages is also one of the biggest puzzles; studies on pathology and populations with language impairment, in turn, throw light on this question.
Lexical-semantics: Semantics is the linguistic domain concerned with meaning, and lexical-semantics is the sub-field concerned with word meanings.
Distributional learning (distributional cues/visual distributional learning): Some think to recognize is to categorize (e.g., Harnad et al., 2005). “Distributional learning is a form of statistical learning and entails the learning of categories based on the frequency distribution of variants in the environment.” The article specifies “a visual distributional learning task based on Junge et al. (2018) to test novel object categorization in children with and without DLD.”
Two txt-format data files were published by the researchers. I saved the two files as csv files. Each file targets one research question.
df_Q1 <- read_csv("./data/CAT_Test.csv", col_names = TRUE)
## New names:
## Rows: 400 Columns: 20
## ── Column specification
## ──────────────────────────────────────────────────────── Delimiter: "," chr
## (10): Condition, StimRef, StimLeft, StimRight, Target, AnswerStim, Posit... dbl
## (10): ...1, Subject, Trial, Item, ACC, RT, AnswerStimD1, AnswerStimD2, M...
## ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
## Specify the column types or set `show_col_types = FALSE` to quiet this message.
## • `` -> `...1`
df_Q2 <- read_csv("./data/CAT_regr.csv", col_names = TRUE)
## Rows: 25 Columns: 14
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (14): Subject, DLMeanAcc, Age_months, ActVocab_raw, DigitSpanFW_raw, Dig...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Now we have some idea of most of the variables involved in answering RQ1 and RQ2. The ones involved in my replication are:
RQ1:
Group: The researchers obtained a final sample of 25 children with DLD. 25 TD children were matched from a larger sample from previous studies where data had already been collected. The children with DLD “were recruited via different institutions in the Netherlands;” the TD children were recruited from two primary schools. Their recruitment criteria are quite standard to my knowledge. For more details, please see subsection Participants under section Method in the article. This is a categorical variable.
Condition: Figure 1 demonstrates the experimental design. Participants are familiarized with a continuum of tokens. The frequency distribution of each token is plotted in Figure 1. Each participant is assigned to either Condition 1 (Blue) or Condition 2 (Orange). Token 4, Token 6 and Token 8 were named D1 (Deviant 1), S (Standard) and D2 (Deviant 2). A participant who is sensitive to frequency distributional cues will judge D1 and S as more alike if assigned Condition 1, and S and D2 as more alike if assigned Condition 2, due to the different placement of the category boundary. This is also a categorical variable.
Age_months: This variable is inherently treated as a predictor variable in this type of study. Here, it is treated as a continuous variable.
AnswerStimD2: This is the dependent variable, i.e., the choice made by participants in the test trials after familiarization under either condition. This variable, relative to Condition, depicts distributional learning ability. It is in the format of binomial counts, which determines that, for RQ1, I will likely choose glmer(family = "binomial") for the inferential analysis replication.
PositionD2: The researchers suspected that whether the D2 token was presented on the left or the right of the screen (see Figure 2) might affect participants’ choice preference, so they added this categorical variable as a random factor.
Subject: This variable is inherently treated as a random factor in this type of study.
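To make the Condition manipulation concrete, here is a toy sketch of what the two familiarization distributions over an eight-token continuum could look like. The frequencies below are hypothetical, not the actual values from Figure 1 of the article; the point is only that each condition puts the low-frequency dip (the implied category boundary) in a different place, so D1, S and D2 group differently.

```r
library(ggplot2)

# Hypothetical token frequencies (NOT the real values from Figure 1):
# Condition 1 has its dip at token 7, so D1 (token 4) and S (token 6)
# fall on the same side of the boundary; Condition 2 has its dip at
# token 5, so S (token 6) and D2 (token 8) fall on the same side.
toy_freq <- data.frame(
  Token = rep(1:8, times = 2),
  Condition = rep(c("Condition 1", "Condition 2"), each = 8),
  Frequency = c(3, 5, 7, 7, 7, 7, 1, 3,   # hypothetical: dip at token 7
                3, 5, 7, 7, 1, 7, 7, 3))  # hypothetical: dip at token 5

ggplot(toy_freq, aes(x = Token, y = Frequency, fill = Condition)) +
  geom_col(position = "dodge") +
  labs(title = "Toy familiarization distributions (hypothetical frequencies)")
```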
The head() function gives an overview of the dataset structure and lets us examine all the variables as column names.
head(df_Q1)
## # A tibble: 6 × 20
## ...1 Subject Condition Trial Item ACC StimRef StimLeft StimRight Target
## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr> <chr>
## 1 1 501 Condition 1 2 1 0 S D2 D1 D2
## 2 2 501 Condition 1 3 7 0 S D2 D1 D2
## 3 3 501 Condition 1 4 5 0 S D2 D1 D2
## 4 4 501 Condition 1 6 6 0 S D1 D2 D2
## 5 5 501 Condition 1 7 2 0 S D1 D2 D2
## 6 6 501 Condition 1 9 8 0 S D1 D2 D2
## # ℹ 10 more variables: RT <dbl>, AnswerStim <chr>, PositionD2 <chr>,
## # AnswerStimD1 <dbl>, AnswerStimD2 <dbl>, MeanAccPP <dbl>, Gender <chr>,
## # Group <chr>, Age_months <dbl>, Age_ym <chr>
RQ2:
DLMeanAccuracy: Unlike RQ1, here they use accuracy (ranging from 0 to 1) based on participants’ choices to represent distributional learning ability (e.g., choosing D1 in a test question when assigned Condition 1 is considered accurate). It is a transformation of earlier data (Condition, AnswerStimD1 and AnswerStimD2).
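Since DLMeanAcc is described as a transformation of the RQ1 trial-level data, a quick consistency check is possible. This is only a sketch under two assumptions: that MeanAccPP in df_Q1 is the per-subject mean of the trial-level ACC column, and that the children with DLD appear in both files under the same Subject codes.

```r
# Sketch of a consistency check (assumptions stated above): the per-subject
# mean of trial-level ACC should reproduce MeanAccPP, and, for the children
# with DLD, match DLMeanAcc in df_Q2.
acc_check <- df_Q1 %>%
  group_by(Subject) %>%
  summarise(acc_from_trials = mean(ACC), MeanAccPP = first(MeanAccPP))

acc_check %>%
  inner_join(select(df_Q2, Subject, DLMeanAcc), by = "Subject") %>%
  head()
```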
Children with DLD’s scores on various tasks: The study ran multiple regression analyses on several areas of ability, each with different component scores.
head(df_Q2)
## # A tibble: 6 × 14
## Subject DLMeanAcc Age_months ActVocab_raw DigitSpanFW_raw DigitSpanBW_raw
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 701 0 88 21 5 3
## 2 702 1 98 8 5 4
## 3 703 0.125 104 38 8 4
## 4 705 1 98 33 7 2
## 5 706 0.375 93 21 5 2
## 6 707 0.75 111 36 5 4
## # ℹ 8 more variables: DigitSpanTot_raw <dbl>, NWR_raw <dbl>,
## # PassVocab_raw <dbl>, NonVerbInt_raw <dbl>, WordAss_raw <dbl>,
## # WordCatEx_norm <dbl>, WordCatRec_norm <dbl>, WordCatTot_norm <dbl>
For RQ1, they did three major analyses besides descriptive data presentation and visualizations: “Split-half reliability distributional learning task,” “Group comparison distributional learning task” and “Exploratory results.” The results are listed in several sub-sections under section Results. I replicate only the descriptive data visualization for RQ1 (participants’ choices under different conditions) and the second analysis, which is the most essential analysis in my judgment. In the first analysis, they concluded that the task is a reliable test. In the second analysis, they found that “school-aged children can learn novel visual object categories based on distributional properties,” but they couldn’t conclude “whether children with DLD do or do not have a distributional learning deficit.” In the third analysis, which included only children with DLD, they again couldn’t conclude whether children with DLD are able to learn novel visual object categories based on distributional information.
For RQ2, their dataset file includes only children with DLD and their performance across multiple areas. They present the results of “four separate multiple linear regression analyses in R to test the relationship between distributional learning and different measures of vocabulary,” as well as some descriptive data/visualizations, in subsection Regression analyses under section Results. I replicate only the descriptive data analyses and visualizations.
Here is the visualization for RQ1.
I recoded the column AnswerStimD2 to enable
y = count in ggplot.
df_Q1_visual <- df_Q1 %>%
mutate(AnswerStimD2 = case_when(AnswerStimD2 == 0 ~ "D1",
AnswerStimD2 == 1 ~ "D2")) %>%
group_by(Group, Condition, AnswerStimD2) %>%
summarise(count = n())
## `summarise()` has grouped output by 'Group', 'Condition'. You can override
## using the `.groups` argument.
Here is my replication for Figure 3.
ggplot(df_Q1_visual, aes(x = factor(Condition), y = count, fill = factor(AnswerStimD2))) +
geom_bar(stat="identity") +
labs(x = "Condition", y = "Count", fill = "Answer") +
theme_minimal() +
facet_wrap(~Group)
Figure 3 is replicable: the choice counts for D1 and D2 by condition and group are the same in the two figures. From the visualization, I at least get a sense that school-aged children do make different choices depending on condition (i.e., they are sensitive to distributional cues).
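To complement the plot with numbers, the same counts can be turned into proportions of D2 choices per condition and group. This is my own quick check; these proportions are not reported as such in the article.

```r
# Proportion of D2 choices by Group and Condition, computed from the counts
# above. df_Q1_visual is still grouped by Group and Condition after
# summarise(), so mutate() normalizes within each Group x Condition cell.
df_Q1_visual %>%
  mutate(prop = count / sum(count)) %>%
  filter(AnswerStimD2 == "D2")
```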
Here is the visualization for DLMeanAccuracy, my replication of Figure 4.
ggplot(df_Q2, aes(x=DLMeanAcc)) +
geom_histogram(position="dodge")+
theme(legend.position="top")+
labs(x = "Accuracy", y = "Count")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Again, we see that Figure 4 is replicable. One take-away from Figure 4 is that the DLD group may not perform as badly as we might expect.
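A few descriptive numbers alongside Figure 4 make the spread explicit (my addition; the article does not report DLMeanAcc this way):

```r
# Spread of distributional learning accuracy in the DLD sample.
df_Q2 %>%
  summarise(M = mean(DLMeanAcc), SD = sd(DLMeanAcc),
            Min = min(DLMeanAcc), Max = max(DLMeanAcc))
```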
Here is my replication:
For better readability, I renamed the columns.
df_Q2_desc <- df_Q2
new_colnames <- c("Subject", "Accuracy", "Age months", "Productive vocabulary", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition", "Receptive vocabulary", "Raven’s progressive matrices", "Word associations", "Word classes receptive", "Word classes expressive", "Word classes total")
colnames(df_Q2_desc) <- new_colnames
Then I use pivot_longer() to enable summarise(). Column names go under “Task” and test scores go under “Score.”
df_Q2_desc <- df_Q2_desc %>%
pivot_longer(cols = c("Productive vocabulary", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition", "Receptive vocabulary", "Raven’s progressive matrices", "Word associations", "Word classes receptive", "Word classes expressive", "Word classes total"), names_to = "Task", values_to="Score")
The columns of the table I summarised are the mean, standard deviation and range, following the column names of Table 1 and Table 2. Some elements of Table 1 and Table 2 are missing here. For example, age-equivalent scores are looked up from the test manuals (or other forms of the norms), and those data were not included in the shared dataset. For almost all tests, raw scores were used in the further regression analyses, so those are available in the dataset; however, there are three exceptions, as explained in the original paper, for which norm scores instead of raw scores are available. I added one column in the replication to specify which type of score is available. Data from both tables (Table 1 and Table 2) are included, and I added another column to specify the test category (vocabulary or control).
df_Q2_desc_sum <- df_Q2_desc %>%
group_by(Task) %>%
summarise(M = mean(Score), SD = sd(Score), Range = paste0(min(Score), "..", max(Score))) %>%
mutate(Score_type = case_when(Task %in% c("Productive vocabulary", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition", "Receptive vocabulary", "Raven’s progressive matrices", "Word associations") ~ "Raw",
Task %in% c("Word classes receptive", "Word classes expressive", "Word classes total") ~ "Norm"))%>%
mutate(Task_type = case_when(Task %in% c("Productive vocabulary", "Receptive vocabulary", "Word associations", "Word classes receptive", "Word classes expressive", "Word classes total") ~ "vocabulary",
Task %in% c("Raven’s progressive matrices", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition") ~ "control")) %>%
arrange(Task_type, Score_type)
df_Q2_desc_sum
## # A tibble: 11 × 6
## Task M SD Range Score_type Task_type
## <chr> <dbl> <dbl> <chr> <chr> <chr>
## 1 Digit Span Backwards 2.72 1.02 0..4 Raw control
## 2 Digit span forwards 5.36 1.58 3..9 Raw control
## 3 Digit span total 8.08 1.91 4..12 Raw control
## 4 Non-word repetition 3.36 2.36 0..9 Raw control
## 5 Raven’s progressive matrices 23.2 7.41 11..38 Raw control
## 6 Word classes expressive 7.24 2.63 3..12 Norm vocabulary
## 7 Word classes receptive 6.88 2.71 1..13 Norm vocabulary
## 8 Word classes total 6.88 2.60 2..13 Norm vocabulary
## 9 Productive vocabulary 28.2 8.94 8..41 Raw vocabulary
## 10 Receptive vocabulary 90.5 13.0 70..119 Raw vocabulary
## 11 Word associations 23.9 6.37 10..42 Raw vocabulary
Among the available data, the replication is accurate compared to Table 1 and Table 2.
I replicated the second analysis “Group comparison distributional learning task.”
Figure 5 shows a thorough description from the article.
As stated earlier, the core function should be glmer(), given the nature of the dependent variable. We don’t have to do the first step, as fillers and practice trials were already removed from the available dataset. I first processed the relevant categorical variables with as.factor() so the numeric information in Subject wouldn’t interfere with the formula, and so the variables are ready for sum-to-zero orthogonal coding.
df_Q1_inf <- df_Q1 %>%
mutate(Condition = as.factor(Condition),
Group = as.factor(Group),
PositionD2 = as.factor(PositionD2),
Subject = as.factor(Subject))
str(df_Q1_inf)
## tibble [400 × 20] (S3: tbl_df/tbl/data.frame)
## $ ...1 : num [1:400] 1 2 3 4 5 6 7 8 9 10 ...
## $ Subject : Factor w/ 50 levels "501","502","503",..: 1 1 1 1 1 1 1 1 2 2 ...
## $ Condition : Factor w/ 2 levels "Condition 1",..: 1 1 1 1 1 1 1 1 2 2 ...
## $ Trial : num [1:400] 2 3 4 6 7 9 10 12 2 3 ...
## $ Item : num [1:400] 1 7 5 6 2 8 3 4 1 2 ...
## $ ACC : num [1:400] 0 0 0 0 0 0 0 0 0 1 ...
## $ StimRef : chr [1:400] "S" "S" "S" "S" ...
## $ StimLeft : chr [1:400] "D2" "D2" "D2" "D1" ...
## $ StimRight : chr [1:400] "D1" "D1" "D1" "D2" ...
## $ Target : chr [1:400] "D2" "D2" "D2" "D2" ...
## $ RT : num [1:400] 2287 3803 2696 3585 2224 ...
## $ AnswerStim : chr [1:400] "D1" "D1" "D1" "D1" ...
## $ PositionD2 : Factor w/ 2 levels "Left","Right": 1 1 1 2 2 2 1 2 1 2 ...
## $ AnswerStimD1: num [1:400] 1 1 1 1 1 1 1 1 0 1 ...
## $ AnswerStimD2: num [1:400] 0 0 0 0 0 0 0 0 1 0 ...
## $ MeanAccPP : num [1:400] 0 0 0 0 0 0 0 0 0.75 0.75 ...
## $ Gender : chr [1:400] "F" "F" "F" "F" ...
## $ Group : Factor w/ 2 levels "DLD","TD": 2 2 2 2 2 2 2 2 2 2 ...
## $ Age_months : num [1:400] 100 100 100 100 100 100 100 100 95 95 ...
## $ Age_ym : chr [1:400] "8;4" "8;4" "8;4" "8;4" ...
It would be handy to print the original levels of these three variables before we set out to do the sum-to-zero orthogonal coding.
print(c(levels(df_Q1_inf$Condition), levels(df_Q1_inf$Group), levels(df_Q1_inf$PositionD2)))
## [1] "Condition 1" "Condition 2" "DLD" "TD" "Left"
## [6] "Right"
The way I do sum coding follows the suggestions in Figure 6. The suggestions came from ChatGPT, and the conversation link was provided by one of my mentors for another project.
According to the suggestion, this is what I will do.
contrasts(df_Q1_inf$Condition) <- contr.sum(2)
contrasts(df_Q1_inf$Group) <- contr.sum(2)
contrasts(df_Q1_inf$PositionD2) <- contr.sum(2)
But Figure 5 specified further details of the contrasts. In the following chunks I follow those instructions to replicate the details. These are the products of the previous chunk for Condition:
contrasts(df_Q1_inf$Condition)
## [,1]
## Condition 1 1
## Condition 2 -1
Here are the adjusted ones.
contrasts(df_Q1_inf$Condition)[,1] = c(0.5, -0.5)
contrasts(df_Q1_inf$Condition)
## [,1]
## Condition 1 0.5
## Condition 2 -0.5
Same for other variables.
contrasts(df_Q1_inf$Group)
## [,1]
## DLD 1
## TD -1
contrasts(df_Q1_inf$Group)[,1] = c(-0.5, 0.5)
contrasts(df_Q1_inf$Group)
## [,1]
## DLD -0.5
## TD 0.5
contrasts(df_Q1_inf$PositionD2)
## [,1]
## Left 1
## Right -1
contrasts(df_Q1_inf$PositionD2)[,1] = c(0.5, -0.5)
contrasts(df_Q1_inf$PositionD2)
## [,1]
## Left 0.5
## Right -0.5
The processing for Age_months is a bit different, but
the idea is the same.
head(df_Q1_inf$Age_months <- scale(df_Q1_inf$Age_months, scale=FALSE))
## [,1]
## [1,] 2.9
## [2,] 2.9
## [3,] 2.9
## [4,] 2.9
## [5,] 2.9
## [6,] 2.9
Then I write out the model (especially the formula) based on the description in Figure 5.
m_inf_0 <- glmer(AnswerStimD2~Condition*Group*Age_months + Condition*PositionD2 + (1|Subject) + (PositionD2|Subject) , data=df_Q1_inf, family = "binomial")
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues
summary(m_inf_0)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## AnswerStimD2 ~ Condition * Group * Age_months + Condition * PositionD2 +
## (1 | Subject) + (PositionD2 | Subject)
## Data: df_Q1_inf
##
## AIC BIC logLik deviance df.resid
## 447.8 503.7 -209.9 419.8 386
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.9109 -0.5202 -0.2974 0.5533 3.0067
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 0.5610 0.7490
## Subject.1 (Intercept) 1.2619 1.1233
## PositionD21 0.4114 0.6414 -1.00
## Number of obs: 400, groups: Subject, 50
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.204216 0.263571 -4.569 4.9e-06 ***
## Condition1 1.397168 0.507302 2.754 0.00589 **
## Group1 0.585513 0.487481 1.201 0.22971
## Age_months -0.032793 0.042056 -0.780 0.43554
## PositionD21 0.611823 0.360980 1.695 0.09010 .
## Condition1:Group1 -0.002138 0.969560 -0.002 0.99824
## Condition1:Age_months -0.062258 0.094602 -0.658 0.51047
## Group1:Age_months 0.009085 0.088528 0.103 0.91826
## Condition1:PositionD21 -0.136239 0.629104 -0.217 0.82855
## Condition1:Group1:Age_months -0.342899 0.212337 -1.615 0.10634
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) Cndtn1 Group1 Ag_mnt PstD21 Cn1:G1 Cn1:A_ Gr1:A_ C1:PD2
## Condition1 -0.183
## Group1 -0.050 -0.004
## Age_months -0.035 0.039 -0.124
## PositionD21 -0.322 0.078 0.140 0.010
## Cndtn1:Grp1 -0.016 -0.017 -0.171 0.196 -0.117
## Cndtn1:Ag_m 0.002 -0.015 0.286 0.088 0.255 -0.218
## Grp1:Ag_mnt -0.140 0.201 0.035 0.223 0.166 -0.032 0.094
## Cndtn1:PD21 0.078 -0.296 -0.087 -0.019 -0.350 0.085 -0.188 -0.119
## Cndt1:G1:A_ 0.125 -0.083 0.166 -0.028 0.344 -0.168 0.437 0.250 -0.258
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## unable to evaluate scaled gradient
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues
Starting from scratch, I saw the two types of warnings shown above. The estimates from this model don’t seem correct either, so I searched for information and added an optimizer based on suggestions from https://stats.stackexchange.com/questions/164457/r-glmer-warnings-model-fails-to-converge-model-is-nearly-unidentifiable
m_inf <- glmer(AnswerStimD2~Condition*Group*Age_months + Condition*PositionD2 + (1|Subject) + (PositionD2|Subject) , data=df_Q1_inf, family = "binomial", glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=100000)))
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: large eigenvalue ratio
## - Rescale variables?
summary(m_inf)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## AnswerStimD2 ~ Condition * Group * Age_months + Condition * PositionD2 +
## (1 | Subject) + (PositionD2 | Subject)
## Data: df_Q1_inf
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+05))
##
## AIC BIC logLik deviance df.resid
## 447.8 503.7 -209.9 419.8 386
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.9002 -0.5204 -0.2933 0.5531 3.0038
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 0.6477 0.8048
## Subject.1 (Intercept) 1.1712 1.0822
## PositionD21 0.4132 0.6428 -0.98
## Number of obs: 400, groups: Subject, 50
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.205521 0.262463 -4.593 4.37e-06 ***
## Condition1 1.396511 0.506362 2.758 0.00582 **
## Group1 0.576846 0.489381 1.179 0.23851
## Age_months -0.033863 0.042227 -0.802 0.42259
## PositionD21 0.602816 0.360218 1.673 0.09423 .
## Condition1:Group1 0.006756 0.971582 0.007 0.99445
## Condition1:Age_months -0.065210 0.094284 -0.692 0.48917
## Group1:Age_months 0.007933 0.088645 0.089 0.92870
## Condition1:PositionD21 -0.122365 0.627926 -0.195 0.84549
## Condition1:Group1:Age_months -0.349616 0.209860 -1.666 0.09572 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) Cndtn1 Group1 Ag_mnt PstD21 Cn1:G1 Cn1:A_ Gr1:A_ C1:PD2
## Condition1 -0.180
## Group1 -0.051 -0.005
## Age_months -0.035 0.039 -0.124
## PositionD21 -0.313 0.075 0.141 0.010
## Cndtn1:Grp1 -0.014 -0.017 -0.169 0.197 -0.114
## Cndtn1:Ag_m -0.003 -0.014 0.284 0.088 0.250 -0.212
## Grp1:Ag_mnt -0.145 0.204 0.033 0.223 0.164 -0.027 0.084
## Cndtn1:PD21 0.078 -0.284 -0.087 -0.019 -0.347 0.083 -0.183 -0.117
## Cndt1:G1:A_ 0.120 -0.082 0.165 -0.031 0.338 -0.160 0.426 0.241 -0.252
## optimizer (bobyqa) convergence code: 0 (OK)
## Model is nearly unidentifiable: large eigenvalue ratio
## - Rescale variables?
There is still a warning, but the later replication analyses yield the same results as the original article, so I decided to keep it here.
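One way to gain more confidence that the remaining warning is harmless (a check I did not run in the original workflow) is lme4’s allFit(), which refits the model with every available optimizer so the fixed-effect estimates can be compared across them:

```r
# Refit m_inf with all available optimizers; if the fixed effects agree
# across optimizers, the convergence warning is likely a false alarm.
# (This can take a while to run.)
af <- allFit(m_inf)
summary(af)$fixef  # one row of fixed-effect estimates per optimizer
```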
The first prediction that they made for RQ1 is that school-aged children should demonstrate distributional learning ability, which should “manifest as a significant effect of Condition on the dependent variable.” See Figure 7.
Here are the confirmatory results. See Figure 8.
Their interpretation focuses on the odds ratio change. To calculate the OR change for Condition1, I extracted the estimate for Condition1; see https://difiore.github.io/ada-2024/23-module.html. Instead of $coefficients, I used fixef() as suggested in https://stackoverflow.com/questions/26417005/odds-ratio-and-confidence-intervals-from-glmer-output. From the same link, I got a line of code to generate the confidence intervals.
(ORchange_1 <- exp(fixef(m_inf)[2]))
## Condition1
## 4.041078
CI_1 <- tidy(m_inf,conf.int=TRUE,exponentiate=TRUE,effects="fixed")
CI_1 %>%
filter(term == "Condition1") %>%
select(conf.low, conf.high)
## # A tibble: 1 × 2
## conf.low conf.high
## <dbl> <dbl>
## 1 1.50 10.9
Thus, we replicated the results that “children in Condition 1 were 4.04 times more likely to choose stimulus D2 than children in Condition 2, and this effect was significantly above 1: z = 2.758, p = 0.006.” This is in line with their prediction.
For the second prediction, I repeated a similar process.
(ORchange_2 <- exp(fixef(m_inf)[6]))
## Condition1:Group1
## 1.006779
CI_2 <- tidy(m_inf,conf.int=TRUE,exponentiate=TRUE,effects="fixed")
CI_2%>%
filter(term == "Condition1:Group1") %>%
select(conf.low, conf.high)
## # A tibble: 1 × 2
## conf.low conf.high
## <dbl> <dbl>
## 1 0.150 6.76
Thus we replicated the results that “although the effect of Condition was 1.007 times stronger in the TD group compared to the DLD group, this interaction between Condition and Group was not significantly above 1: z = 0.007, p = 0.994, 95% CI 0.15 .. 6.8.” Their second prediction is not confirmed.
(1/0.1499334) #conf.low
## [1] 6.669628
With this simple calculation we also replicated that “the confidence interval tells us that children with DLD could be up to 6.7 times better or 6.8 times weaker on the visual distributional learning task than TD children.” They couldn’t “conclude whether children with DLD do or do not have a distributional learning deficit.”
Here is a summary table of the results for the two predictions.
df_Q1_inf_sum <- tidy(m_inf,conf.int=TRUE,exponentiate=TRUE,effects="fixed")
df_Q1_inf_sum %>%
filter(term == "Condition1" | term == "Condition1:Group1") %>%
select(term, estimate, p.value, conf.low, conf.high)
## # A tibble: 2 × 5
## term estimate p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Condition1 4.04 0.00582 1.50 10.9
## 2 Condition1:Group1 1.01 0.994 0.150 6.76
For the portion I replicated, the overall process was very successful. The descriptive statistical analyses and the two visualizations all match the presentations in the article.
One minor problem in the descriptive statistical analysis is that, for many tests, the norm scores are not available (they were not used in further analyses and thus not included in the data file), so some elements of Table 1 and Table 2 are missing in the replication, along with some interpretations present in the article. Raw scores alone do not allow us to compare children with DLD to the norms, or to compare across areas of ability.
The biggest problem I encountered was in the inferential statistical analysis for RQ1, where I had to reconstruct the formula and additional arguments in my code chunk, since the formula was not listed and some details were missing. The authors didn’t mention using an optimizer (though the replicated results indicate they did). In response to the warning and the mismatched results, I resorted to a piece of code from Stack Exchange and wasn’t confident in it. Although the nature of the assignment means that the formula being unavailable in the article is a good thing, for replication purposes (including design replication) it might be beneficial for researchers to list the formula for others to use as a reference when designing similar experiments and analysis protocols. Overall, although the formula and the other arguments to the glmer() function weren’t directly available, the researchers did provide very thorough descriptions of the variables and effects, as well as of how they applied sum-to-zero orthogonal coding. These details were extremely helpful, and the paragraph in Figure 5 provides a good example for my future writing. Although I replicated the results presented in the original paper for the single inferential analysis I picked, I am still less confident in using the sum coding as described in the article and in its interpretation (i.e., what is set as the reference level).
The replicated results are quite expected from an academic perspective. The profiles of children with DLD can be highly variable; the nature of this diagnosis and population might explain Figure 4 (accuracy widely distributed from 0 to 1 in the sample) as well as the other non-conclusions of the study.